Search CORE

15 research outputs found

Learning to simplify sentences with quasi-synchronous grammar and integer programming

Authors: Lapata M., Woodsend K.
Publication date: 01/01/2011

Semantic analysis for paraphrase identification using semantic role labeling

Authors: Daniel B., Kumar J. A., Liu D., Mihalcea R., Sumathy K. L., Woodsend K., Yakushiji A.
Publication venue: 'Association for Computing Machinery (ACM)'
Publication date: 01/01/2019

Document reuse has become prominent as information content is digitized, driven by the widespread use of the internet and smartphones. It takes various complex forms, such as inserting, omitting, and substituting words, and changing word order. In particular, when a word in a document is substituted with a similar word, existing morphological similarity measures fail to take the substitution into account. To address this problem, a range of research has been conducted on similarity measurement that considers semantic information. This study proposes a semantic similarity measure based on the semantic role information of sentences, obtained by semantic role labeling. To assess its performance, the proposed method was compared with the substring similarity method used to measure similarity between existing documents. The results show that the proposed method performed similarly to the conventional method on plagiarized documents that were rarely modified, whereas it gave improved results for paraphrased sentences whose structure had been changed.

Crossref

University of Tasmania Open Access Repository

Conformal prediction of biological activity of chemical compounds

Authors: A Gammerman, Alexander Gammerman, AN Jain, C-C Chang, DC Weis, EY Chang, F Pedregosa, G Shafer, Ilia Nouretdinov, J-L Faulon, K Woodsend, Paolo Toccaceli, V Monev, V Vovk, Y Wang, Y You
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/10/2017

Crossref

Royal Holloway - Pure

A Resource Aware MapReduce Based Parallel SVM for Large Scale Image Classifications

Authors: A Smeulders, C Chu, C Tsai, C Waring, DE Goldberg, G Huang, G Zanghirati, J Dean, J Venner, JK Suykens, K Woodsend, L Cao, LJ Cao, M Boutell, Man Qi, Maozhen Li, N Cristianini, Nasullah Khalid Alham, R Collobert, S Abe, S Keerthi, S Knerr, T Sikora, W Wong, Wenming Guo, Y Chen, Y Liu, Y Lu, Y Zhu-Hong, Yang Liu
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 18/09/2015

Machine learning techniques have facilitated image retrieval by automatically classifying and annotating images with keywords. Among them, support vector machines (SVMs) are used extensively due to their generalization properties. However, SVM training is a computationally intensive process, especially when the training dataset is large. This paper presents RASMO, a resource-aware, MapReduce-based parallel SVM algorithm for large-scale image classification, which partitions the training dataset into smaller subsets and optimizes SVM training in parallel using a cluster of computers. A genetic-algorithm-based load balancing scheme is designed to optimize the performance of RASMO in heterogeneous computing environments. RASMO is evaluated in both experimental and simulation environments. The results show that the parallel SVM algorithm reduces the training time significantly compared with the sequential SMO algorithm while maintaining a high level of classification accuracy.

Funding: National Basic Research Program (973) of China under Grant 2014CB34040

Crossref

Brunel University Research Archive

Exploiting separability in large-scale linear support vector machine training

Authors: A. Altman, C.L. Lawson, D. Goldfarb, E. Dolan, E. Osuna, E.M. Gertz, J. Gondzio, J. Platt, J. Weston, Jacek Gondzio, K. Woodsend, Kristian Woodsend, M. Ferris, N. Cristianini, O.L. Mangasarian, R. Collobert, R. Herbrich, R.J. Vanderbei, S. Fine, S. Lucidi, S.J. Wright, S.S. Keerthi, T. Joachims, V. Vapnik, W. Chu, Y.J. Lee
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/01/2009

Linear support vector machine training can be represented as a large quadratic program. We present an efficient and numerically stable algorithm for this problem using interior point methods, which requires only O(n) operations per iteration. By exploiting the separability of the Hessian, we provide a unified approach, from an optimization perspective, to 1-norm classification, 2-norm classification, universum classification, ordinal regression and ɛ-insensitive regression. Our approach has the added advantage of obtaining the hyperplane weights and bias directly from the solver. Numerical experiments indicate that, in contrast to existing methods, the algorithm is largely unaffected by noisy data, and they show that training times for our implementation are consistent and highly competitive. We discuss the effect of using multiple correctors, and of monitoring the angle of the normal to the hyperplane to determine termination.

CiteSeerX

Crossref

Edinburgh Research Explorer